populace-fit: weight-aware conditional models (regime-gated chained QRF) #2

Merged: MaxGhenis merged 11 commits into main from fit-kernel on Jun 10, 2026

@MaxGhenis MaxGhenis commented Jun 10, 2026

The conditional-models operator — populace.fit, the second shard of the stack (after populace-frame). Per DESIGN.md ("populace-fit: conditional models").

What's here

  • ConditionalModel / FittedModel protocols (model.py) — fit(frame, predictors, targets, *, weights="design") -> FittedModel; FittedModel.predict(frame_or_df) -> DataFrame (one draw column per target). Weight-aware by construction: weights selects which typed weight vector of the owning entity to use (default that entity's design weights), reading the Frame's typed Weights rather than a raw array. weights="none" is the only way to fit unweighted, and a misspelled or mismatched kind raises (naming the culprit) instead of silently falling back to unweighted — closing the 2026-06 microimpute landmine at the type boundary. resolve_fit_weights is the single authority for this rule.
  • QRF / RegimeGatedQRF (qrf.py) — the canonical model: regime-gated (structural, unweighted sign-mixture gates), sequentially chained (each target conditions on the predictors plus the targets already drawn), quantile-regression-forest draws (a seeded per-row quantile), with the frame's weights materialized by weighted bootstrap (importance-resample the training rows by weight before growing each forest). Reimplemented from scratch against the Frame — it does not import microimpute. This is the microimpute#196 fix as the reference mechanism.
  • Import-time kernel-compat assert (__init__.py) — the charter's constellation-versioning mechanism: populace-frame's series is checked at import (pre-1.0, the 0.x minor; major from 1.0 on), so a loose resolver that ignores [tool.uv.sources] cannot silently assemble an incompatible kernel pair.
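
The weight-resolution rule in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in model.py: `TypedWeights`, `KNOWN_KINDS`, and `resolve_fit_weights_sketch` are hypothetical stand-ins, and the real function reads the Frame rather than a bare object.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Kinds named in this PR; the real set lives in populace-frame (assumption).
KNOWN_KINDS = ("design", "importance", "calibrated")


@dataclass(frozen=True)
class TypedWeights:
    """Hypothetical stand-in for the Frame's typed Weights."""

    kind: str
    values: Sequence[float]


def resolve_fit_weights_sketch(
    stored: TypedWeights, requested: str = "design"
) -> Optional[Sequence[float]]:
    """Return the weight vector to fit with, or None for an explicit unweighted fit."""
    if requested == "none":
        # The only unweighted path: the caller has to ask for it by name.
        return None
    if requested not in KNOWN_KINDS:
        # A typo raises and names itself instead of falling back to unweighted.
        raise ValueError(
            f"unknown weights kind {requested!r}; expected one of {KNOWN_KINDS} or 'none'"
        )
    if requested != stored.kind:
        # A mismatch raises and names the kind the frame actually carries.
        raise ValueError(
            f"requested weights={requested!r} but the frame carries {stored.kind!r} weights"
        )
    return stored.values
```

The point of the design is that no branch returns `None` implicitly: every failure mode is an exception that names the offending kind.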

The headline contract

test_weighted_fit_shifts_draws_toward_weighted_truth is the working implementation of the placeholder the kernel left skipped in packages/populace-frame/tests/test_contracts.py. It uses a donor whose target is large exactly where weight is small (the #196 shape; the high-value regime is independent of the predictors, so honoring the weight is the only way to recover the weighted conditional) and asserts:

  • (a) the mean of the weighted fit's draws lands within 20% of the true weighted mean;
  • (b) weights="none" lands within 20% of the unweighted mean;
  • (c) the two differ by >3x.

This is the microimpute#196 bug class — now a standing guarantee of the stack rather than a latent footgun. Follow-up: the kernel's skipped test_weighted_fit_shifts_draws_toward_weighted_truth placeholder can be unskipped/removed once this shard is in the workspace (it could not be edited from this branch's scope).
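
To make the three assertions concrete, here is a small standard-library sketch of the arithmetic behind them. The donor values and weights are illustrative assumptions, not the test fixture, and a plain weighted resample stands in for the full forest.

```python
import random

# Hypothetical donor in the #196 shape: 20% of rows carry a huge value
# but a tiny weight. All numbers are illustrative, not the test fixture.
values = [1_000.0] * 4_000 + [50_000.0] * 1_000
weights = [20.0] * 4_000 + [1.0] * 1_000

# The two truths the contract compares against.
unweighted_mean = sum(values) / len(values)
weighted_mean = sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Weighted bootstrap: importance-resample n rows with probability
# proportional to weight, then take a plain mean of the resample.
rng = random.Random(0)
resample = rng.choices(values, weights=weights, k=len(values))
bootstrap_mean = sum(resample) / len(resample)

print(f"unweighted mean: {unweighted_mean:.1f}")
print(f"weighted mean:   {weighted_mean:.1f}")
print(f"bootstrap mean:  {bootstrap_mean:.1f}")
```

Here the unweighted mean is 10800 and the weighted mean is about 1605, so the two differ by well over 3x, and the resample mean tracks the weighted one. As the later review notes, a check on means alone cannot see whether the rare high-value regime survives.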

Plus: regime gates preserve a zero-inflated target's zero mass and both signs (no zero-crossing); chaining reproduces a cross-target correlation; predict row-count/index match the input; fixed seed is deterministic; weights="none" is the only unweighted path. 35 tests, n=5000 seeded for CI speed.

Note on the scikit-learn pin

scikit-learn is capped >=1.5,<1.9. scikit-learn 1.9 removed sklearn.tree._tree.DTYPE, which quantile-forest imports — so an unbounded >=1.5 resolves to 1.9 and import quantile_forest fails. On the workspace's Python 3.14 interpreter the cap keeps the only working combination (scikit-learn 1.8 + quantile-forest 1.4) resolvable; the cap can be lifted once quantile-forest tracks the 1.9 tree ABI.

Optional heavy deps (scikit-learn, quantile-forest) stay in this shard, never in populace-frame.

Validation

uv sync --all-packages && uv run pytest packages/populace-fit && uv run ruff check packages/populace-fit — all green (35 passed; ruff clean). Full workspace suite: 192 passed, 3 skipped (the microunit/policyengine_us-gated kernel tests).

🤖 Generated with Claude Code

MaxGhenis and others added 3 commits June 10, 2026 10:02
Add the packages/populace-fit shard skeleton: src-layout PEP 420 namespace (no src/populace/__init__.py), its own pyproject (deps populace-frame + scikit-learn + quantile-forest + numpy + pandas), and [tool.uv.sources] populace-frame = workspace so the workspace resolves it locally.

scikit-learn is capped <1.9: 1.9 removed sklearn.tree._tree.DTYPE, which quantile-forest imports, breaking import on the only Python-3.14 wheel set.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…d QRF

model.py: the ConditionalModel/FittedModel protocols and resolve_fit_weights — the single authority enforcing that a fit is weighted by construction (weights reads the owning entity's typed Weights; weights='none' is the only unweighted path; a misspelled or mismatched kind raises rather than silently fitting unweighted). predictors_targets_entity refuses predictors/targets that span entities.

qrf.py: the canonical model. Regime detection (structural/unweighted sign-support gates), sequential chaining (each target conditions on predictors plus the targets already drawn), and weighted bootstrap (importance-resample rows by weight before growing each forest — the microimpute#196 fix, reimplemented from scratch against the Frame, not imported). Draws sample the weighted conditional by querying the forest at a per-row seeded quantile.

__init__.py: public API (ConditionalModel, QRF/RegimeGatedQRF, fit) and the constellation compatibility gate — asserts populace-frame's major/minor at import so a loose resolver cannot assemble an incompatible kernel pair.
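
A rough sketch of how such an import-time gate could compare version series (the 0.x minor before 1.0, the major from 1.0 on). The function names and the error message are assumptions for illustration; the real check lives in __init__.py and may differ.

```python
def series(version: str) -> tuple[int, ...]:
    """Compatibility series: (0, minor) before 1.0, (major,) from 1.0 on."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (0, minor) if major == 0 else (major,)


def assert_compatible(found: str, required: tuple[int, ...]) -> None:
    """Raise at import time if the installed kernel is from another series."""
    if series(found) != required:
        raise ImportError(
            f"populace-frame {found} is in series {series(found)}, "
            f"but this shard requires series {required}"
        )


# Hypothetical usage: in a real package `found` would come from
# importlib.metadata.version("populace-frame").
assert_compatible("0.1.4", (0, 1))
```

Failing at import keeps an incompatible pair from running at all, even when a resolver ignored the workspace source pin.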

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…d placeholder)

test_weighted_fit_contract.py implements the kernel's skipped test_weighted_fit_shifts_draws_toward_weighted_truth: on a donor whose target is large exactly where weight is small (the #196 shape, ~20% low-weight huge-value rows), the mean of the weighted fit's draws lands within 20% of the true weighted mean, the mean under weights='none' lands within 20% of the unweighted mean, and the two differ by >3x. Also asserts the default is weighted (no unweighted default).

test_qrf.py: regime gates preserve a zero-inflated target's zero mass and both signs with no zero-crossing; chaining reproduces a cross-target correlation; predict row-count/index match the input; fixed seed is deterministic; successive predicts draw independently.

test_model.py: weights='none' is the only unweighted path (a typo'd kind raises and names it; a mismatched kind raises and names the stored kind); predictors/targets spanning entities are refused. test_compat.py exercises the import gate.

n=5000 seeded for CI speed; 35 tests.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@MaxGhenis MaxGhenis marked this pull request as draft June 10, 2026 08:45
@MaxGhenis

Adversarial review — converting to draft; soundness fixes needed before merge

An independent clean-room review (every finding reproduced) found bugs in the operator's core purpose. Marking draft. Ranked:

HIGH — the weighted-bootstrap gate annihilates rare low-weight classes. The gate is fit on an n-of-n weighted resample, so a scarce-but-oversampled stratum (exactly what the charter's pool design carries: oversampled in rows, downweighted in mass) gets zero rows in the resample → single-class gate → that class drawn with probability 0, forever. Reproduced: 9/10 gates single-class, 0 positive draws in 2M rows vs ~80 expected. The fit re-rarefies what the pool deliberately oversampled — the opposite of "tail support is strata's job." Fix: HistGradientBoostingClassifier honors sample_weight exactly; fit the gate weighted, drop its bootstrap (the forests still need the bootstrap — QRF ignores weight magnitude, confirmed at _quantile_forest.py:266).

HIGH — household-weights-only frames can't be fit weighted at all. The canonical CPS shape (person-level fit, household design weights) fails weights_for("person"); only weights="none" runs. The operator built to prevent unweighted-by-accident makes the unweighted escape hatch the only thing that runs on the most representative input — microimpute#196's social mechanism, rebuilt. Fix: resolve through effective (broadcast) weights — touches kernel API (_effective_weights is private; accounting already uses it, so wmean is weighted on a frame where fit refuses to be).

HIGH — NaN targets silently become zeros in gated regimes (NaN-blind sign labels → zero class). Survey item-nonresponse silently moves mass to $0. Fix: validate finite at fit, raise naming the column + count.

MEDIUM — tail mass undershot ~2x, and the contract test passes for the wrong reason ("delete all low-weight rows" also passes — it asserts only means, never that the rare regime survives). Plus the 201-point grid winsorizes draws to [0.5%, 99.5%]. This is the capital-gains/dividends tail problem at the method level. Fix: assert high-draw-share survival, draw leaf values as atoms incl. endpoints.

MEDIUM — fit and draw RNG share one stream (predict quantiles bit-identical to the gate's bootstrap uniforms; max|diff|=0). Fix: SeedSequence(seed).spawn(2).

LOW — missing populace-frame>=0.1,<0.2 pin (charter mandate; gate works but resolution-time failure is the better failure); impossible-remediation error message; duplicate-target / target-in-predictors unguarded; (n×201) memory at design scale.

Sound, must not regress: the weighted factorization P(sign|x)·P(y|x,sign); chained-equations semantics (drawn values fed forward); resolve_fit_weights as the single enforcement point; the compat gate; determinism. The headline contract does kill the literal #196 bug — it just can't see tail mass, rare-class survival, or the household-weighted frame it never builds.

MaxGhenis and others added 8 commits June 10, 2026 11:48
Add Frame.resolve_weights(entity) -> Weights: resolves effective weights
like _effective_weights but returns a typed Weights that carries the
source entity's kind. An entity without its own stored weights inherits
the single weighted group entity's design/importance/calibrated kind and
broadcast values; an entity with its own weights is returned as-is. The
existing ambiguity guards (zero/multiple weighted group entities) are
kept.

This fixes the "household-weighted frame can't be fit weighted" bug: a
person-level fit can now read the inherited household kind instead of a
bare ndarray that dropped the kind. accounting._resolve migrates to
resolve_weights(owner).values (behavior identical).
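
The inheritance described above can be pictured with a small sketch: household weights are broadcast onto persons through membership, and the kind travels with the values. `TypedWeights` and `broadcast_to_members` are hypothetical names for illustration, not the kernel's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedWeights:
    """Hypothetical stand-in for the Frame's typed Weights."""

    kind: str
    values: tuple[float, ...]


def broadcast_to_members(
    group: TypedWeights, member_group_index: list[int]
) -> TypedWeights:
    """Give each member its group's weight, keeping the group's kind."""
    values = tuple(group.values[i] for i in member_group_index)
    return TypedWeights(kind=group.kind, values=values)


# Two households with design weights; three persons, two in household 0.
household = TypedWeights(kind="design", values=(150.0, 40.0))
person = broadcast_to_members(household, member_group_index=[0, 0, 1])
print(person)
```

Returning a typed object rather than a bare array is what lets a downstream kind check still see "design" after the broadcast.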

Regression tests (test_bundle.py TestResolveWeights): person resolve on a
household-weighted frame returns Weights(kind=design, broadcast values);
calibrated household resolves to calibrated person; an entity with its own
weights returns that exact object; ambiguity (two weighted group entities)
still raises; unknown entity is named.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
resolve_fit_weights now resolves via frame.resolve_weights(entity) rather
than frame.weights_for(entity), so a person-level fit on a
household-weighted frame inherits the household weights (and their kind)
through membership instead of raising. This was the bug: the canonical CPS
shape (person predictors/targets, design weights only on the household)
could not be fit weighted at all.

The kind discipline is unchanged — the requested kind must match the
resolved (possibly inherited) kind, else raise. The impossible-remediation
message is fixed: requesting "design" on a calibrated frame no longer
advises "advance the frame's weights to design" (kinds only move forward,
so that is impossible); it now tells the caller to pass
weights="calibrated", the kind the frame actually carries. The forward
direction (e.g. requesting calibrated on a design frame) keeps the
advance-the-weights advice.

Regression tests (test_model.py): the CPS shape fits weighted and the
resolved vector broadcasts the household weights onto persons; a default
design fit on a calibrated frame raises naming weights="calibrated" and
not the impossible advance-to-design advice. The existing kind-mismatch
test now matches "resolved weights" wording.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The sign-class gate (HistGradientBoostingClassifier) was fit on an n-of-n
weighted bootstrap, which deletes vanishingly-rare low-weight classes
outright: with ~10 positive rows at weight 1 among ~4990 zeros at weight 50,
each resample draw lands on a positive with probability ~4e-5, so the resampled labels routinely contain
only the zero class and the gate could never draw the positive sign (0
positive draws across millions, reproduced).
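
The scale of the problem follows from short arithmetic. Taking the regression fixture's approximate numbers (about 10 positive rows at weight 1 and about 4990 zero rows at weight 50) as assumptions, the chance that an n-of-n weighted resample contains no positive row at all is:

```python
# Approximate fixture numbers quoted in this PR, treated as exact here.
n_pos, w_pos = 10, 1.0
n_zero, w_zero = 4990, 50.0
n = n_pos + n_zero

# Probability that a single weighted draw lands on any positive row.
p_pos = (n_pos * w_pos) / (n_pos * w_pos + n_zero * w_zero)

# Probability that all n draws of the bootstrap miss every positive row,
# leaving the gate with a single class.
p_single_class = (1.0 - p_pos) ** n

# Positive draws expected in 2M rows if the gate honored the weighted share.
expected_positive_in_2m = 2_000_000 * p_pos

print(f"p_pos = {p_pos:.2e}")
print(f"P(single-class resample) = {p_single_class:.3f}")
print(f"expected positives in 2M draws = {expected_positive_in_2m:.1f}")
```

Under these assumptions roughly four resamples in five lose the positive class entirely, which is in line with the review's observation that most gates came out single-class, and the expected count of about 80 positives in 2M draws matches the figure the review cites.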

HistGradientBoostingClassifier honors sample_weight exactly, so the gate
is now fit with sample_weight=weights directly, no bootstrap. Every
training row is present, so every sign class the data contains survives
into classes_. The weighted bootstrap stays for the QRF forests, which
genuinely need it: quantile-forest uses sample_weight only as a >0 leaf
mask (confirmed _quantile_forest.py:266), so it ignores weight magnitude
and the resample is the only way to weight the leaf distributions.

A guard now enforces internal consistency: if a sign class present in the
training labels is absent from the fitted gate's classes_, the fit raises
rather than silently drawing that class at probability zero.

Regression tests (test_qrf.py): the reviewer's repro (n=5000, ~10 positive
at weight 1, ~4990 zero at weight 50) keeps both gate classes and produces
positive draws across seeds (was 0/2M); the consistency guard raises when
a stubbed gate drops a training class.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ndpoints, chunking)

Four changes to how the quantile forest is grown and read out, so draws
reproduce the conditional's tail instead of undershooting it ~2x:

1. max_samples_leaf=None. The forests are now grown keeping ALL leaf
   samples; quantile-forest's default of 1 keeps one sample per leaf,
   thinning each row's conditional to ~n_estimators atoms and undershooting
   tail mass. Exposed as a RegimeGatedQRF param (default None). On the
   contract fixture the weighted share above 300k goes from ~0.0035
   (nearest-snap, msl=1) to ~0.0050 — the weighted-population truth.

2. Linear interpolation. draw() no longer snaps each row to the nearest of
   201 grid points (which quantizes every draw and biases the tail toward
   the bracket interior); it linearly interpolates the row's value at its
   exact quantile between the two bracketing grid quantiles.

3. Drawable extremes. The quantile grid now includes points adjacent to 0
   and 1, so the observed conditional min and max are drawable. q=1 is the
   observed maximum, not extrapolation — the old comment wrongly excluded
   it. With a lone extreme the interior-only grid (top q=0.995) reads far
   below the max; the endpoint reaches it.

4. Chunked predict. draw() batches the predict over rows
   (_PREDICT_CHUNK_ROWS=50k) so the (n_rows x n_grid) matrix never
   materializes whole — at 3M+ rows it would be tens of GB. Chunking is
   bit-identical to a single pass (quantiles are drawn up front and sliced
   positionally).
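
Items 2 and 3 can be illustrated for a single row with a standard-library sketch. It assumes a quantile grid that runs from 0 to 1 inclusive and a row whose conditional quantile values are already known; `interp_draw` and `nearest_snap` are hypothetical helpers, not the functions in qrf.py.

```python
from bisect import bisect_right


def interp_draw(u: float, grid_q: list[float], grid_v: list[float]) -> float:
    """Value at quantile u, linearly interpolated between bracketing grid points."""
    if not 0.0 <= u <= 1.0:
        raise ValueError(f"quantile {u} is outside [0, 1]")
    hi = min(bisect_right(grid_q, u), len(grid_q) - 1)
    lo = max(hi - 1, 0)
    q0, q1 = grid_q[lo], grid_q[hi]
    if q1 == q0:
        return grid_v[hi]
    t = (u - q0) / (q1 - q0)
    return grid_v[lo] + t * (grid_v[hi] - grid_v[lo])


def nearest_snap(u: float, grid_q: list[float], grid_v: list[float]) -> float:
    """The older behavior: snap to the value at the nearest grid quantile."""
    nearest = min(range(len(grid_q)), key=lambda i: abs(grid_q[i] - u))
    return grid_v[nearest]


# Toy row: a coarse three-point grid whose top point is a lone extreme.
grid_q = [0.0, 0.5, 1.0]
grid_v = [10.0, 20.0, 100.0]
print(interp_draw(0.25, grid_q, grid_v), nearest_snap(0.3, grid_q, grid_v))
print(interp_draw(1.0, grid_q, grid_v))  # the observed maximum is drawable
```

With the endpoint in the grid, a draw at u = 1 returns the observed maximum; a grid that stops at 0.995 would cap every draw below it.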

Regression tests (test_qrf.py): the weighted tail share above 300k is
within ~2x of truth and materially closer than the nearest-snap baseline;
a draw at q->1 reaches the observed conditional max via the grid endpoint,
which the interior-only winsorized grid misses.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
NaN targets were silently relabeled to the zero class: the sign labels
(y > atol / y < -atol) are both False for NaN, so a missing value was
miscoded as a structural zero. The model has no notion of
missingness, so fit now validates at entry that every target column is
entirely finite and raises a ValueError naming the offending column and
its non-finite count (NaN or inf). Predictors are not checked — a forest
splits around NaN features and a missing predictor is not silently
miscoded the way a missing target is.
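
A minimal sketch of both halves of this change: why NaN falls into the zero class under the sign comparisons, and what an entry-time check could look like. The function names, the `atol` default, and the message wording are assumptions; the real validation in qrf.py may differ.

```python
import math


def sign_label(y: float, atol: float = 1e-9) -> str:
    """Sign class from two comparisons; NaN fails both and lands in 'zero'."""
    if y > atol:
        return "positive"
    if y < -atol:
        return "negative"
    return "zero"


def validate_finite_targets(targets: dict[str, list[float]]) -> None:
    """Raise naming the first target column that holds NaN or inf, with its count."""
    for name, values in targets.items():
        bad = sum(1 for v in values if not math.isfinite(v))
        if bad:
            raise ValueError(
                f"target {name!r} has {bad} non-finite value(s) (NaN or inf); "
                "handle missing values before fitting"
            )


print(sign_label(float("nan")))  # NaN is mislabeled as a structural zero
validate_finite_targets({"wages": [0.0, 12.5, -3.0]})  # passes silently
```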

Regression tests (test_qrf.py): a target with 3 NaNs raises naming the
column and the count; an inf target is refused the same way.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…tract

Finding F: the fitted model was seeded with the raw model seed, making its
draw uniforms bit-identical to the fit's bootstrap-selection uniforms (the
draws were not independent of the fit's resampling). Seed fit and draw from
two independent SeedSequence children of the model seed; determinism is
preserved. Finding H: add a contract that the zero gate reproduces the
*weighted* (population) zero-share, not the sample's, when the two differ.
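
For Finding F, the fix pattern with NumPy's `SeedSequence` looks roughly like this. The helper name is hypothetical and the stream sizes are arbitrary; only `SeedSequence.spawn` and `default_rng` are NumPy's own API.

```python
import numpy as np


def spawn_fit_and_draw_rngs(seed: int):
    """Two independent generators derived deterministically from one model seed."""
    fit_ss, draw_ss = np.random.SeedSequence(seed).spawn(2)
    return np.random.default_rng(fit_ss), np.random.default_rng(draw_ss)


# The bug shape: two generators built from the same raw seed emit
# bit-identical uniforms, so draws replay the fit's resampling stream.
shared_a = np.random.default_rng(42).random(5)
shared_b = np.random.default_rng(42).random(5)

# The fix shape: spawned children give unrelated streams that are still
# fully reproducible from the model seed.
fit_rng, draw_rng = spawn_fit_and_draw_rngs(42)
fit_u = fit_rng.random(5)
draw_u = draw_rng.random(5)

print(np.array_equal(shared_a, shared_b))  # identical streams
print(np.array_equal(fit_u, draw_u))       # independent streams
```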

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…olumns

Finding G: pin populace-frame>=0.1,<0.2 (the constellation must resolve, not
fail only at the import-time compat gate), and refuse duplicate predictors,
duplicate targets, or a column that is both predictor and target (these
silently fit twice / fit P(y|y) before). Messages name the culprits.
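
The column guard in Finding G could be sketched as below. `check_columns` and its message format are illustrative assumptions, not the shard's actual function.

```python
from collections import Counter


def check_columns(predictors: list[str], targets: list[str]) -> None:
    """Refuse duplicate predictors, duplicate targets, or predictor/target overlap."""
    problems = []
    dup_predictors = sorted(c for c, k in Counter(predictors).items() if k > 1)
    dup_targets = sorted(c for c, k in Counter(targets).items() if k > 1)
    overlap = sorted(set(predictors) & set(targets))
    if dup_predictors:
        problems.append(f"duplicate predictors: {dup_predictors}")
    if dup_targets:
        problems.append(f"duplicate targets: {dup_targets}")
    if overlap:
        # A column on both sides would mean fitting P(y | y).
        problems.append(f"columns used as both predictor and target: {overlap}")
    if problems:
        raise ValueError("; ".join(problems))


check_columns(["age", "state"], ["wages", "dividends"])  # passes silently
```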

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ification review)

Two independent verification reviews (mutation testing) found that the
resolve_weights kind fix introduced two bugs in _inherited_kind, which
only handled the weighted-group path while _effective_weights gives
person-stored weights precedence when deriving a group entity:

1. Regression: weighted accounting (wsum/wmean/wquantile/gini/...) of a
   group-entity column on a person-only-weighted frame raised instead of
   deriving the group weights from the person weights — on a frame shape
   the fit suite's own fixtures build.
2. Silent kind mislabel: a third entity's resolved values (from the
   person source) were tagged with a sibling group's kind, which could
   leak through resolve_fit_weights as a kind-discipline violation.

Make _inherited_kind recurse exactly as _effective_weights does so kind
always names the source the values come from. Three regression tests:
person-only group accounting, mixed-kind coherence, and a leaf-component
pin for the tail-draw fix (max_samples_leaf=None was untested).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@MaxGhenis MaxGhenis marked this pull request as ready for review June 10, 2026 14:49
@MaxGhenis MaxGhenis merged commit d4dc9df into main Jun 10, 2026
2 checks passed
@MaxGhenis MaxGhenis deleted the fit-kernel branch June 10, 2026 14:49